Multiple annotation for biodiversity: developing an annotation framework among biology, linguistics and text technology
Abstract
Biodiversity information is contained in countless digitized and unprocessed scholarly texts. Although automated extraction of these data has been gaining momentum for years, there are still innumerable text sources that are poorly accessible and require a more advanced range of methods to extract relevant information. To improve the access to semantic biodiversity information, we have launched the BIOfid project ( www.biofid.de ) and have developed a portal to access the semantics of German language biodiversity texts, mainly from the 19th and 20th century. However, to make such a portal work, a couple of methods had to be developed or adapted first. In particular, text-technological information extraction methods were needed, which extract the required information from the texts. Such methods draw on machine learning techniques, which in turn are trained by learning data. To this end, among others, we gathered the bio text corpus, which is a cooperatively built resource, developed by biologists, text technologists, and linguists. A special feature of bio is its multiple annotation approach, which takes into account both general and biology-specific classifications, and by this means goes beyond previous, typically taxon- or ontology-driven proper name detection. We describe the design decisions and the genuine Annotation Hub Framework underlying the bio annotations and present agreement results. The tools used to create the annotations are introduced, and the use of the data in the semantic portal is described. Finally, some general lessons, in particular with multiple annotation projects, are drawn.
Similar resources
A CAD System Framework for the Automatic Diagnosis and Annotation of Histological and Bone Marrow Images
Due to ever increasing of medical images data in the world’s medical centers and recent developments in hardware and technology of medical imaging, necessity of medical data software analysis is needed. Equipping medical science with intelligent tools in diagnosis and treatment of illnesses has resulted in reduction of physicians’ errors and physical and financial damages. In this article we pr...
Developing Guidelines and Ensuring Consistency for Chinese Text Annotation
With growing interest in Chinese Language Processing, numerous NLP tools (e.g. word segmenters, part-of-speech taggers, and parsers) for Chinese have been developed all over the world. However, since no large-scale bracketed corpora are available to the public, these tools are trained on the corpora with different segmentation criteria, part-of-speech tagsets and bracketing guidelines, and ther...
Interlingual Annotation of Parallel Text Corpora: A New Framework for Annotation and Evaluation
This paper focuses on the next step in the creation of a system of meaning representation and the development of semantically-annotated parallel corpora, for use in applications such as machine translation, question answering, text summarization, and information retrieval. The work described below constitutes the first effort of any kind to provide parallel corpora annotated with detailed deep ...
Interlingual annotation of parallel text corpora: a new framework for annotation and evaluation
This paper focuses on an important step in the creation of a system of meaning representation and the development of semantically-annotated parallel corpora, for use in applications such as machine translation, question answering, text summarization, and information retrieval. The work described below constitutes the first effort of any kind to annotate multiple translations of foreign-language...
A General Framework for Personalized Text Classification and Annotation
The tremendous volume of digital contents available today on the Web and the rapid spread of Web 2.0 sites, blogs and forums have exacerbated the classical information overload problem. Moreover, they have made even worse the challenge of finding new content appropriate to individual needs. In order to alleviate these issues, new approaches and tools are needed to provide personalized content r...
Journal
Journal title: Language Resources and Evaluation
Year: 2021
ISSN: 1574-020X, 1574-0218
DOI: https://doi.org/10.1007/s10579-021-09553-5